-
-
Notifications
You must be signed in to change notification settings - Fork 754
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICU-21254 Add plural rule parsing for exponent operand in C++ #1286
Conversation
I wanted to get this PR out as soon as I could, so I figured it might be better to ask theses few smaller, basic issues that I had over the PR. Now that we have the PR:
|
Looks good to me, Shane may have a more informed opinion.
Long lines seem to be fine in ICU source code. But you are looking to break up a long C string "aaabbbcccddd" you can always use string concatenation as in |
Locale locale = Locale::createFromName(localeName); | ||
|
||
struct TestCase { | ||
const char* skeleton; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional: It would be more correct to make this char16_t*
(and u""
strings), because number skeletons are 16-bit strings. Currently you are performing an implicit codepage conversion in the NumberFormatter:forSkeleton
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
Can I get a rubber stamp, of sorts? I forced-pushed after amending the branch commit in order to re-trigger the PR check on Jira ticket status and also to make the optional type change that Shane suggested (+ breaking up long lines in source code). |
FYI: The MSVC and GCC MSYS2 CI builds seem to be crashing in the new test (but only in Debug builds).
|
@jefgen Thanks for the heads up, so strange. I got ahead of myself after seeing tests pass locally. I see some MSVC tests pass, whereas these have a segfault. If you happen to have any guesses as to what I'm doing, let me know. |
Sure, let me try to take a look into this in a bit. :) |
Okay, so it looks like the crash is happening on the first test case: Here's the full stack-trace when the test fails:
Debugging a bit, it looks like the problem is actually here:
The creation of the But the test doesn't check the This later causes a null-pointer dereference as |
Ah, found it. -- The problem is the last character, the "…". |
Good find! It looks like PluralRules does expect a "…" (U_2026) rather than "...": I think the correct fix is to use a 16-bit UnicodeString instead of an 8-bit string. Currently you have LocalPointer<PluralRules> rules(PluralRules::createRules(
"one: i = 0,1 @integer 0, 1 @decimal 0.0~1.5; "
"many: e = 0 and i % 1000000 = 0 and v = 0 or e != 0 .. 5; "
"other: @integer 2~17, 100, 1000, 10000, 100000, 1000000, "
" @decimal 2.0~3.5, 10.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0, …", errorCode)); However LocalPointer<PluralRules> rules(PluralRules::createRules(
u"one: i = 0,1 @integer 0, 1 @decimal 0.0~1.5; "
u"many: e = 0 and i % 1000000 = 0 and v = 0 or e != 0 .. 5; "
u"other: @integer 2~17, 100, 1000, 10000, 100000, 1000000, "
u" @decimal 2.0~3.5, 10.0, 100.0, 1000.0, 10000.0, 100000.0, 1000000.0, …", errorCode)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. The latest test failures appear to be unrelated; I'm fixing them in #1292.
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
Hooray! The files in the branch are the same across the force-push. 😃 ~ Your Friendly Jira-GitHub PR Checker Bot |
Looks like all the checks passed. Now is finally the time for a re-stamp. |
I almost forgot, @pedberg-icu -- where is the logKnownIssue for ICU-21254? I can't find it in the ICU codebase. Let me know if I need to amend this PR. |
There is no longer a logKnownIsssue for ICU-21254; I realized that the tests I was originally skipping with it were actually broken due to https://unicode-org.atlassian.net/browse/ICU-21258, and changed the logKnownIssue to reference ICU-21258 instead. |
This PR adds the missing functionality in C++ for parsing plural rule strings as a part of supporting the exponent operand in plural rules. The previous exponent operand work was implemented in #938 (Java) and #972 (C++).
As a side note, after adding the logic for the plural rule parser for the 'e' operation in C++, I noticed that the equivalent code for the 'w' operand seems to be missing -- is that intentional?
Checklist